Abstract



The importance of understanding the mechanism of protein aggregation into
insoluble amyloid fibrils lies not only in its medical consequences, but also in
its more basic properties of self-organization. The discovery that a large number
of unrelated proteins can form, under proper conditions, structurally similar
fibrils has suggested that the underlying mechanism is a general feature of
polypeptide chains. In the present work, we address the early events preceding
amyloid fibril formation in solutions of zinc-free human insulin incubated at low
pH and high temperature. Insulin is an easy-to-handle model for protein
fibrillation; in addition, its subcutaneous aggregation after injection is a nuisance
which affects patients with diabetes. Here, we show by time-lapse atomic force
microscopy (AFM) that a steady-state distribution of protein oligomers with an
exponential tail is reached within a few minutes after heating. This metastable
phase lasts for a few hours, until aggregation into fibrils suddenly occurs. A
theoretical explanation of the pre-fibrillar oligomer distribution is given in terms
of a simple coagulation-evaporation kinetic model, in which concentration plays
the role of a critical parameter. Owing to the high resolution and sensitivity of the
AFM technique, the observation of a long-lasting latency time should be considered
an actual feature of the aggregation process, and not simply ascribed to
instrumental inefficiency. These experimental facts, along with the kinetic model used,
point to a critical role of thermal concentration fluctuations in the process of
fibril nucleation.
Introduction



Self-assembly of proteins or peptides into linear elongated structures known as amyloid
fibrils is a conserved feature accompanying the clinical manifestation of many pathologies,
such as systemic amyloidosis or several neurodegenerative diseases (Alzheimer's disease,
transmissible spongiform encephalopathy, etc.). In several cases, fibril formation is
regarded as the onset and the cause of such diseases ("Amyloid Hypothesis") [2]. More
generally, a large number of unrelated proteins share the ability to assemble into
similar fibrillar structures under appropriate conditions, which typically favour non-native
conformations [3, 4]. Therefore, the study of fibrillation kinetics is important in order to
understand the processes and the interactions involved in amyloid self-assembly and to design
molecular inhibitors.

The 51-residue hormone insulin has long been known to form fibrils if heated at low
pH, that is, when monomeric or dimeric forms are promoted. Indeed, insulin
is protected from fibrillation by assembling into Zn-hexamers during in vivo storage or in
artificial delivery systems. In acidic conditions, insulin aggregation proceeds mainly
via three steps [11, 12, 13]: formation of active centers (nucleation), elongation of these
centers to fibrils (growth), and floccule formation [14]. This is a typical scheme for protein
polymerization [15] or amyloid formation [16, 17]. More recently, the structure of insulin
fibrils has been shown to resemble that of typical amyloid fibrils, with the characteristic
cross-β structure [22].



In order to understand the molecular mechanism responsible for the emergence of
fibrils, it is necessary to gain insight into the early stages of the process. The observation of
partially folded intermediate conformations in conditions preceding insulin fibril formation
[23, 24] has provided molecular insight into the interactions involved, yet the
onset of aggregation and the causes leading to fibril nucleation and elongation are not clearly
understood.

The early stages of fibrillogenesis are, in general, difficult to investigate, due to the
inherent instability of such systems. Quenching the incubating solution to low temperature
allows one to perform molecular weight filtering and circular dichroism experiments [29], but
the information so obtained concerns conditions different from the incubating ones. Light
scattering is particularly suited to detecting either large supramolecular aggregates
or protein-sized objects at sufficiently high mass concentration, and consequently it misses
the early events in fibrillation kinetics. Neutron scattering has been used to detect small
fibrillar precursors [31], but it requires long measurements, so that only time-averaged
quantities can be obtained. Atomic force microscopy (AFM) is able to detect fine-grained
features of samples deposited on a substrate (the resolution is set by the radius of
curvature of the tip, that is, about 5 nm). Time-lapse AFM has been extensively used
to observe the structure and growth of amyloid fibrils.

In the present work, we performed AFM experiments during fibrillation of human insulin.
In particular, we focus on the early stages preceding the observation of mature fibrils. In
order to explore the lag phase with sufficient time resolution, we used zinc-free recombinant
human insulin, since in this case fibril formation takes place on the time scale of hours
and is slower than that of the better studied bovine insulin [35]. AFM snapshots
taken at different times show a distribution of ellipsoidal oligomeric aggregates, consistent
with analogous findings in other amyloidogenic systems, such as the Alzheimer's amyloid-β (1-40)
peptide or other proteins. After a few hours of incubation, ellipsoidal
protein oligomers disappear from AFM images and amyloid fibrils of different lengths are
detected, with a structure analogous to that observed for bovine insulin fibers. This
abrupt change in aggregate distribution and shape occurs within the time resolution of the
experiment, that is 30 minutes.

A main result of our experiments is that the oligomer distribution is stationary
during the lag phase and exhibits an exponential tail. The median values of this
distribution are consistent with electrospray mass spectrometry experiments performed on
bovine insulin in analogous conditions [19], but larger oligomers, with aggregation numbers
up to several tens, are also involved. This metastable phase can be explained by a
coagulation-evaporation process that has been proposed for colloidal aggregation. According
to this model, the existence of a stationary oligomer distribution is critically controlled
by protein concentration. Consequently, small local concentration fluctuations are enough to
make the system cross over to the dynamical phase characterized by large, "elongated",
growing clusters.



Our work is thus in harmony with experimental observations and theoretical
studies [50, 51] of protein cluster distributions in conditions promoting protein
crystallization. The present results shed new light on the current view of fibril nucleation,
assigning a relevant role to thermal fluctuations and to the protein-protein interactions
leading to cluster formation, rather than to physical fibrillar precursors.

Materials and Methods 

Sample preparation. Recombinant human insulin powder (purchased from Sigma
Chemical Co. and used without further purification) was directly dissolved at 5 °C in buffer
solution (50 mM KCl/HCl in Millipore SuperQ water, pH 1.6 at 60 °C). The protein solution
was gently stirred, filtered through a 0.22 μm Millex-GV (Millipore) filter into glass cells, and
incubated at 60 °C. Insulin concentration was 200 μM, as measured by UV absorption at
276 nm using an extinction coefficient of 1.0675 for 1.0 mg/ml. The final concentrations
were consistent with those calculated by weighing the insulin powder, thus confirming that
essentially no material was lost through filtering and that insulin was efficiently dissolved.
After given time intervals, 10 μl of incubated protein solution were diluted into 1 ml of buffer
solution, quenched to low temperature to rapidly inhibit further aggregation, and used for
atomic force microscopy experiments. All chemicals were analytical grade.

Atomic force microscopy (AFM). A few μl of the insulin solution were dropped onto
a freshly cleaved mica substrate (quality ruby muscovite). After a few minutes, the sample
was washed dropwise with Millipore SuperQ water, and then dried with a gentle stream of
dry nitrogen. Images of the protein aggregates were recorded with a Multimode Nanoscope
IIIa AFM (Veeco Instruments, Santa Barbara, CA, USA), operating in Tapping Mode inside
a sealed box where a dry nitrogen atmosphere was maintained. We used rigid cantilevers
with resonance frequencies of about 300 kHz, equipped with single-crystal silicon tips
with a nominal radius of curvature of 5-10 nm. The typical scan size was 500 × 500 nm$^2$
(512 × 512 points), and the scan rate was 1-2 Hz.

Static Light Scattering. Immediately after preparation, samples were placed in the
thermostated cell compartment of a Brookhaven Instruments BI200-SM goniometer, equipped
with a 100 mW Ar laser tuned at $\lambda = 514.5$ nm. The temperature was set at 60 °C and
controlled within 0.05 °C with a thermostated recirculating bath. Scattered light intensity at
90° was measured by using a Brookhaven BI-9000 correlator. Absolute values for scattered
intensity (Rayleigh ratio) were obtained by normalization with respect to toluene,
whose Rayleigh ratio at 514.5 nm was taken as $32 \times 10^{-6}$ cm$^{-1}$.
Results



Time-resolved AFM. Our procedure to investigate the early stages of insulin fibrillation
consists in incubating the protein in a test tube, extracting samples every 30 minutes,
depositing each sample on a substrate and scanning it with the AFM (which takes a time of
the order of minutes). Thus, we obtain snapshots of the aggregation intermediates until
fibrils are formed. Several AFM images of each sample, representative of a given incubation
time, were recorded. This allowed us to collect the topographic data of about $10^4$
aggregates for each incubation time. Snapshots of the system from the beginning of the
incubation (defined as time zero) up to nine hours are displayed in Fig. 1. Such snapshots
indicate that oligomers, but no fibril-like structures, are present in the first four hours
(cf. Fig. 1A-C), until fibrils suddenly appear at time 280 min (cf. Fig. 1D). The overall
process can thus be divided into a long metastable phase, a nucleation event and the growth
of the fibrils. Note that the growth phase is much faster than the metastable phase, the
fibrils having incorporated all oligomers within the time resolution of the experiment,
that is 30 minutes.

AFM data analysis in the early stages of kinetics. Home-made software was
used for detecting the edges of the protein aggregates in the AFM maps [M. Marino, A.
Podestà, P. Piseri et al., unpublished]. The binary maps obtained were then processed
using the Image Processing Toolbox of Matlab (The Mathworks, Inc.) and the average
distributions of aggregate areas were obtained, as shown in Fig. 2A-C.

Deconvolution of the tip shape from AFM images is a critical issue in any quantitative
study of biological samples. Deconvolution algorithms are likely to introduce artefacts in the
data, especially when the basic features in the AFM maps are nanometer sized. Moreover,
the morphology of our system, a quasi two-dimensional close arrangement of nanometer-sized
objects without gaps in between, does not permit the application of simple deconvolution
formulas to the distribution of areas [52]. These formulas apply to the case of
parabolic-spherical tips scanning isolated objects lying on a flat reference plane.

We have thus processed raw AFM images without applying any deconvolution. We
indeed expect reduced convolution effects, because the tip does not penetrate deeply down
to the substrate, but only senses the outermost surface of the protein layer. This ensures only
negligible lateral contact of the tip and consequently a reduced loss of resolution. In addition,
the underestimation of the area of the aggregates caused by the erosion of binary maps
operated by the edge detection algorithm tends to compensate for the opposite effect produced
by the tip shape convolution.

To show that the effects of tip convolution are negligible, we analyzed several AFM images
of highly diluted samples, where isolated aggregates lying on the flat mica surface are visible
(about 20 complexes every 500 × 500 nm$^2$). These model samples were pre-processed using
standard deconvolution algorithms; we used the formula $w' = w - 2\sqrt{2 h R_{tip}}$ (parabolic
tip on a step), where $R_{tip}$ is the tip radius (assumed $R_{tip} \approx 3$ nm), $h$ is the
step height ($h \approx 1.1$ nm, the average aggregate height extracted from the AFM images),
and $w$ and $w'$ are the apparent and deconvoluted widths of the observed features. The
resulting distributions of areas were in good agreement with those obtained from the
non-deconvoluted AFM images (data not shown). In particular, the median and standard
deviation of the areas were 27 ± 35 nm$^2$, to be compared with the values of 30 ± 22 nm$^2$
extracted from the raw AFM images of concentrated samples.
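
As a rough illustration of the size of this correction, the parabolic-tip-on-a-step formula quoted above can be evaluated with the tip radius and step height given in the text. The function name and the 10 nm apparent width are hypothetical, chosen only for the example; this is a sketch, not the processing code used for the paper.

```python
import math

def deconvolved_width(w_apparent_nm, step_height_nm, tip_radius_nm):
    """Apply w' = w - 2*sqrt(2*h*R_tip) (parabolic tip scanning a step)."""
    broadening_nm = 2.0 * math.sqrt(2.0 * step_height_nm * tip_radius_nm)
    return w_apparent_nm - broadening_nm

# Values quoted in the text.
R_TIP_NM = 3.0   # assumed tip radius
H_NM = 1.1       # average aggregate height

# Lateral broadening introduced by the tip, independent of the feature width.
broadening_nm = 2.0 * math.sqrt(2.0 * H_NM * R_TIP_NM)

# Hypothetical feature with a 10 nm apparent width (illustrative value only).
example_width_nm = deconvolved_width(10.0, H_NM, R_TIP_NM)

print(f"tip broadening: {broadening_nm:.2f} nm")
print(f"deconvolved width of a 10 nm feature: {example_width_nm:.2f} nm")
```

With these parameters the correction amounts to roughly 5 nm per feature width, which is why it matters only for isolated, nanometer-sized objects.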

Quantitative estimation of aggregate size from areas rather than from heights is more
reliable, because the peculiar vertical interaction of the AFM tip with biological samples
usually leads to an underestimation of the true height. The same effect is caused by the close
packing of insulin aggregates in relatively concentrated samples, which keeps the tip from
touching the flat reference substrate. Processing AFM images of concentrated
samples, however, allowed us to collect the large statistics required for a stable fit of the
area distributions.

Shape of oligomers. The shape of the aggregates can be characterized by means of
their eccentricity, which was also evaluated from the binary maps using the same Matlab
toolbox. Eccentricity is defined as $\sqrt{1 - (a/b)^2}$, $a$ and $b$ being the minor and
major axis, respectively. This parameter is 0 for a circle and 1 for a segment. Correlations
of eccentricity and areas are shown in Fig. 3; they show that this feature of the system is
also stationary in the metastable phase. Aggregates have a mean eccentricity of 0.75, which
corresponds to a ratio between major and minor axis of about 1.5. Larger aggregates have a
larger eccentricity than smaller ones, thus indicating a preferential one-dimensional
(fibrillar) growth for clustering proteins, consistent with recent theoretical findings on
colloid clusters with both short-range attraction and long-range repulsion [50, 51].
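
The link between the mean eccentricity of 0.75 and the axis ratio of about 1.5 follows directly from the definition of eccentricity. A minimal sketch (the function names are illustrative, not taken from the analysis software):

```python
import math

def eccentricity(minor_axis, major_axis):
    """Eccentricity sqrt(1 - (a/b)^2) for minor axis a and major axis b."""
    return math.sqrt(1.0 - (minor_axis / major_axis) ** 2)

def major_to_minor_ratio(ecc):
    """Invert the definition: b/a = 1 / sqrt(1 - e^2), valid for 0 <= e < 1."""
    return 1.0 / math.sqrt(1.0 - ecc ** 2)

ratio_at_mean = major_to_minor_ratio(0.75)
print(f"b/a at e = 0.75: {ratio_at_mean:.2f}")
```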

Because of the rounding effect of tip convolution, the measured eccentricity is, at most,
an underestimate of the actual one.

Average mass of insulin oligomers at the onset of kinetics. Light scattering
experiments were performed immediately after incubation at 60 °C. Measurement
of the intensity scattered at 90° (scattering vector $q = 23$ μm$^{-1}$) provides the Rayleigh
ratio $I_R(q)$, which is related to the weight-average molecular mass $M_w$ by the relation
$I_R(q) = 4\pi^2 n^2 (dn/dc)^2 \lambda_0^{-4} N_A^{-1} c M_w P_z(q)$, with $c$ the mass
concentration, $n$ the medium refractive index, $\lambda_0$ the incident wavelength, $N_A$
Avogadro's number, and $P_z(q)$ the z-averaged form factor. By taking
$dn/dc = 0.18$ cm$^3$ g$^{-1}$ and $P_z(q) = 1$ (since the initial size of solutes is
much smaller than $q^{-1}$), we obtain an average molecular mass of 23 ± 5 kDa. Considering
that the molecular mass of a single insulin molecule is 5806 Da, the soluble oligomers found
at the onset of kinetics are made up of about 4 ± 1 insulin molecules. Note, however, that
the mean aggregation number obtained by light scattering measurements corresponds to the
ratio between the second and the first moment of the oligomer distribution, and it gives no
information on the actual distribution shape.
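
The quoted scattering vector and mean aggregation number can be checked with a few lines of arithmetic. The sketch below uses the standard relation q = (4πn/λ₀) sin(θ/2). The refractive index n = 1.33 (water) is an assumption of this example and is not stated in the text.

```python
import math

def scattering_vector_per_um(wavelength_nm, refractive_index, angle_deg):
    """q = (4*pi*n / lambda_0) * sin(theta / 2), returned in inverse micrometers."""
    wavelength_um = wavelength_nm / 1000.0
    half_angle = math.radians(angle_deg) / 2.0
    return 4.0 * math.pi * refractive_index / wavelength_um * math.sin(half_angle)

N_MEDIUM = 1.33  # assumed refractive index of the aqueous buffer (not given in the text)
q_per_um = scattering_vector_per_um(514.5, N_MEDIUM, 90.0)

# Mean aggregation number from the measured weight-average mass.
M_W_DA = 23000.0       # 23 kDa, from the text
M_MONOMER_DA = 5806.0  # insulin monomer mass, from the text
mean_aggregation_number = M_W_DA / M_MONOMER_DA

print(f"q = {q_per_um:.1f} um^-1")
print(f"mean aggregation number = {mean_aggregation_number:.2f}")
```

Under this assumption the scattering vector comes out close to the 23 μm$^{-1}$ quoted above, and the mass ratio gives about 4 monomers per oligomer.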

Oligomer distribution preceding amyloid formation. Volumes of imaged objects
were derived from calculated areas and eccentricities, under the assumption that the
aggregates are prolate ellipsoids. Aggregation numbers $n$ are obtained by using the relation
$V = V_0 n^{3/d}$, where $V_0 = 14.1$ nm$^3$ is the van der Waals volume of an insulin monomer,
including a layer of water, derived from the x-ray structure [53], and $d = 2.68$ is an
effective fractal dimension that accounts for the scaling between mass and size of aggregates.
The value $d = 2.68$ is derived from x-ray and light scattering data on oligomers of zinc-free
insulin at high pH. Note that zinc-free insulin is neither tightly packed nor assembled into
toroidal hexamers like zinc insulin. We have checked that, by assuming an effective
fractal dimension between 2 (rather loose aggregates) and 3 (space-filling objects), the
shape of the oligomer distribution is not significantly altered, that is, the distribution
shape is robust with respect to different reasonable choices of molecular packing. This
distribution implies that aggregates built out of up to 50 monomers are detectable in the
initial stages of aggregation. The large size of these oligomers is in agreement with the
micellar precursors identified in the case of the Alzheimer's amyloid-β peptide.
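
The conversion from a measured area and eccentricity to an aggregation number can be sketched as follows. Several points are assumptions of this example rather than statements from the text: the measured area is taken as that of the projected ellipse (π times the product of the semi-axes), the prolate ellipsoid is taken to rotate about its major axis, and the scaling is written as V = V₀ n^(3/d), so that d = 3 corresponds to space-filling objects.

```python
import math

V0_NM3 = 14.1     # monomer volume from the text
D_FRACTAL = 2.68  # effective fractal dimension from the text

def aggregation_number(area_nm2, ecc, v0_nm3=V0_NM3, d=D_FRACTAL):
    """Estimate n from the projected area and eccentricity of one aggregate.

    Assumes a prolate ellipsoid with semi-axes (a, a, b), b >= a, lying flat,
    so that area = pi*a*b and volume = (4/3)*pi*a^2*b, and V = V0 * n**(3/d).
    """
    minor_over_major = math.sqrt(1.0 - ecc ** 2)             # a/b
    b = math.sqrt(area_nm2 / (math.pi * minor_over_major))   # major semi-axis
    a = minor_over_major * b                                 # minor semi-axis
    volume_nm3 = 4.0 / 3.0 * math.pi * a * a * b
    return (volume_nm3 / v0_nm3) ** (d / 3.0)

# Typical aggregate: area 30 nm^2 and eccentricity 0.75 (values quoted in the text).
n_typical = aggregation_number(30.0, 0.75)
print(f"aggregation number of a typical aggregate: {n_typical:.1f}")
```

Under these assumptions a typical aggregate corresponds to roughly six monomers.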

The distribution of oligomer aggregation numbers at different times in the metastable
phase is displayed in Fig. 2D-F. All the curves overlap well, indicating that the size
distribution is stationary. The median aggregation numbers $n_m$ are 5.9, 4.9, and 6.7,
respectively, for the three cases shown in the figure. The tail of such distributions can be
fit by an exponential of the kind $\exp(-n/n_m)$ (cf. Fig. 2), where $n_m$ is the median
aggregation number.

Discussion 

Kinetic model for oligomer distribution. The most evident feature of the distributions
of oligomer size and aggregation number shown in Fig. 2 is that they reach a steady
state within the time detectable in the experiment (i.e. a few minutes). A steady state
means that, unlike the diffusion-limited or reaction-limited mechanisms which regulate the
assembly of larger aggregates, in the present case we deal with an "evaporation" process
(i.e., monomers leaving the aggregates) which competes with "coagulation".

A mechanism for protein association which accounts for both aggregation and evaporation
processes can be outlined in the framework of classical coagulation theory [56]. If we call
$\rho_n(t)$ the number concentration of aggregates built out of $n$ monomers at time $t$, the rate
equation of the system reads:

$$\dot{\rho}_n(t) = \frac{1}{2}\sum_{i+j=n} K_{ij}\,\rho_i(t)\,\rho_j(t) - \rho_n(t)\sum_j K_{nj}\,\rho_j(t) + \lambda_{n+1}\rho_{n+1}(t) - \lambda_n\rho_n(t) + \delta_{n,1}\sum_j \lambda_j\rho_j(t) \qquad (1)$$

where dotted quantities refer to time derivatives. The first two terms on the right-hand side
of the equation are, respectively, the production and loss of n-mers by coagulation of
two clusters of $i$ and $j$ proteins, while the other terms describe the "evaporation" of one
monomer from a cluster of $n+1$ proteins, which yields a cluster of $n$ proteins and a single
protein. Here, we are including no nucleation term, and we are also assuming that three-body
effects can be neglected.
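
To make the structure of the rate equation concrete, its right-hand side can be evaluated numerically for a given set of concentrations. The sketch below assumes, for simplicity, mass-independent rates `K` and `lam` (names chosen for this example) and a distribution with finite support. It is an illustration of the coagulation-evaporation balance, not the authors' code.

```python
def rate_rhs(rho, K=1.0, lam=1.0):
    """Right-hand side of the coagulation-evaporation equation.

    rho[n-1] is the concentration of n-mers for n = 1..m; the result lists
    d(rho_n)/dt for n = 1..2m, since coagulation can create clusters up to 2m.
    Constant rates K (coagulation) and lam (evaporation) are assumed.
    """
    m = len(rho)
    size = 2 * m
    padded = list(rho) + [0.0] * (size - m + 1)  # padded[n-1] for n = 1..size+1
    total = sum(rho)                             # total number concentration
    rhs = []
    for n in range(1, size + 1):
        gain = 0.5 * K * sum(padded[i - 1] * padded[n - i - 1] for i in range(1, n))
        loss = K * padded[n - 1] * total
        evaporation = lam * padded[n] - lam * padded[n - 1]
        monomer_source = lam * total if n == 1 else 0.0
        rhs.append(gain - loss + evaporation + monomer_source)
    return rhs

# Arbitrary illustrative concentrations of monomers, dimers and trimers.
example_rhs = rate_rhs([1.0, 0.5, 0.25])

# Both processes only redistribute monomers, so total mass must be conserved.
mass_rate = sum(n * r for n, r in enumerate(example_rhs, start=1))
print(f"d(mass)/dt = {mass_rate:.2e}")
```

The vanishing mass rate is a useful sanity check on any implementation: for a solution of pure monomers the evaporation terms cancel and only dimer formation remains.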

The simplest solution of such equations has been provided by Krapivsky and Redner
by taking mass-independent rate constants, $K_{ij} = K$ and $\lambda_j = \lambda$, and assuming that only
monomers are present at time zero, that is $\rho_n(0) = c M_0^{-1} \delta_{n,1}$, where $c$ is the
total mass concentration and $M_0$ is the mass of a monomer.

The model displays two behaviours, controlled by the parameter $\mu = K \lambda^{-1} c M_0^{-1}$, that is
by the ratio between the coagulation and the evaporation rate constants and by the initial
concentration of monomers. At low protein concentration ($\mu < 1$) the system displays a
steady-state distribution $P(n) = \rho_n / \sum_n \rho_n$ with an asymptotic exponential tail:

$$P(n) = x^{n-1}\left[1 - \frac{n - \frac{1}{2}}{n+1}\,x\right]\frac{\Gamma\!\left(n - \frac{1}{2}\right)}{\Gamma(n+1)\,\Gamma\!\left(\frac{1}{2}\right)} \qquad (2)$$

where $x = \mu(2 - \mu)$. At $\mu = 1$, the system displays a power-law distribution, while at higher
concentrations ($\mu > 1$) it does not display any steady state, the typical cluster growing
linearly in time.
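
The steady-state distribution can be evaluated numerically to check its normalization and moments. The sketch below assumes the form P(n) = x^(n-1) [1 - (n - 1/2) x / (n + 1)] Γ(n - 1/2) / (Γ(n + 1) Γ(1/2)) with x = μ(2 - μ), and uses μ = 0.7 as an illustrative value in the low-concentration regime. The function name and the truncation at n = 2000 are choices of this example.

```python
import math

def steady_state_pn(n, mu):
    """Steady-state probability of an n-mer for 0 < mu < 1."""
    x = mu * (2.0 - mu)
    # Use log-gamma to avoid overflow of Gamma(n + 1) at large n.
    log_gamma_ratio = math.lgamma(n - 0.5) - math.lgamma(n + 1.0) - math.lgamma(0.5)
    bracket = 1.0 - (n - 0.5) / (n + 1.0) * x
    return x ** (n - 1) * bracket * math.exp(log_gamma_ratio)

mu = 0.7  # illustrative value below the critical point mu = 1
probabilities = [steady_state_pn(n, mu) for n in range(1, 2001)]

norm = sum(probabilities)
first_moment = sum(n * p for n, p in enumerate(probabilities, start=1))
second_moment = sum(n * n * p for n, p in enumerate(probabilities, start=1))

print(f"normalization           : {norm:.6f}")
print(f"mean aggregation number : {first_moment:.4f}")
print(f"second / first moment   : {second_moment / first_moment:.4f}")
```

The ratio of the second to the first moment reproduces 1/(1 - μ), which diverges as μ approaches 1, in line with the loss of the steady state at the critical concentration.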

The tails of the distributions shown in Fig. 2D-F are well fit by equation (2), indicating
that the system is in the low-concentration regime. For the three distributions one obtains,
respectively, $\mu$ = 0.71, 0.66, 0.73. The mass-averaged mean aggregation number $n_z$, which
is accessible through scattering experiments and is found to be 4 ± 1, can be expressed in
terms of the present model as the ratio between the second and the first moment of the
distribution: $n_z = 1/(1 - \mu) = 3.3 \pm 0.4$.
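
The quoted value of $n_z$ can be reproduced from the three fitted values of μ. How the uncertainty was estimated is not stated in the text; the sketch below simply takes the sample standard deviation over the three fits, which is one plausible choice.

```python
from statistics import mean, stdev

MU_FITS = [0.71, 0.66, 0.73]  # fitted values quoted in the text

# n_z = 1 / (1 - mu) for each fitted distribution.
n_z_values = [1.0 / (1.0 - mu) for mu in MU_FITS]

n_z_mean = mean(n_z_values)
n_z_spread = stdev(n_z_values)              # sample standard deviation (assumed estimator)
n_z_at_mean_mu = 1.0 / (1.0 - mean(MU_FITS))

print(f"n_z per fit      : {[round(v, 2) for v in n_z_values]}")
print(f"mean and spread  : {n_z_mean:.2f} +/- {n_z_spread:.2f}")
print(f"n_z at mean mu   : {n_z_at_mean_mu:.2f}")
```

Both estimates fall within the scattering result of 4 ± 1.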

Due to the large value observed for the parameter $\mu$, one could speculate that local
fluctuations in the density of monomers could be the triggering mechanism behind the onset
of fibril formation, akin to what has been proposed for crystal nucleation. Note that this
does not imply a symmetry breaking, since the metastable aggregates already display a
pronounced eccentricity.

The kinetic model used needs no assumption concerning thermodynamic equilibrium.
Nevertheless, it is interesting to consider the free-energy change involved in the
clustering process if one assumes a "metastable" equilibrium condition. In particular, one can
define the free energy $\Delta G_n$ associated with the addition of one monomer to a cluster of
$n$ proteins as $\Delta G_n = -k_B T \ln[f_{n+1}/(f_n f_1)]$, where $k_B$ is the Boltzmann constant
and $f_n$ is the activity of a cluster of $n$ proteins. If we take the activity as
$f_n = c_n/c$, with $c_n$ the mass concentration of the n-mers and $c$ the total concentration,
we obtain for an infinitely large cluster:

$$\frac{\Delta G_\infty}{k_B T} = -\ln\frac{2\mu}{1 - \frac{1}{4}\mu(2 - \mu)} \qquad (3)$$

From our analysis we obtained $\Delta G_\infty = -0.6\,k_B T$. Therefore, the free energy related to
the growth of a large cluster or fiber is easily accessible through thermal fluctuations.
This gives a rationale for the fact that, in insulin as well as in other protein solutions, a
change in temperature or in solvent conditions can trigger fibril formation.
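
The free-energy value can be checked numerically. The sketch assumes the large-cluster limit in the form ΔG∞/(k_B T) = -ln{2μ / [1 - μ(2 - μ)/4]}, which follows from the steady-state distribution with activities taken as mass fractions. The function name is illustrative.

```python
import math

def delta_g_infinity_over_kt(mu):
    """Free energy (in units of k_B*T) for adding a monomer to a very large cluster."""
    x = mu * (2.0 - mu)
    return -math.log(2.0 * mu / (1.0 - x / 4.0))

MU_FITS = [0.71, 0.66, 0.73]  # fitted values quoted in the text

dg_values = [delta_g_infinity_over_kt(mu) for mu in MU_FITS]
dg_at_typical_mu = delta_g_infinity_over_kt(0.7)

for mu, dg in zip(MU_FITS, dg_values):
    print(f"mu = {mu:.2f}  ->  dG_inf = {dg:.2f} k_B T")
print(f"mu = 0.70  ->  dG_inf = {dg_at_typical_mu:.2f} k_B T")
```

For the fitted values of μ the result stays within a fraction of k_B T, comparable to thermal energy.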

Concluding remarks. In the present work, the early stages of human insulin fibrillation
have been monitored by time-lapse AFM, a technique with high resolution and sensitivity.
Experimental observations and theoretical modeling highlight an interesting scenario for the
nucleation mechanism preceding amyloid fibrillation. i) Experiments show that a steady-state
distribution of protein oligomers with an exponential tail is present in solution up to
the abrupt formation of amyloid fibrils (Figs. 1 and 2). ii) The oligomer distribution can be
explained by a kinetic model that combines coagulation and evaporation events (Fig. 2D-F).
According to this model, the formation of "non-stationary", growing aggregates is controlled
by monomer concentration. In the present case, concentration is below the critical value,
yet sufficiently high to allow "above-threshold" thermal concentration fluctuations. iii)
Pre-fibrillar oligomers exhibit a marked eccentricity (Fig. 3), denoting that the symmetry
breaking implied by the existence of fibrillar aggregates has already occurred before
fibrillation. Indeed, it is reasonably related to a "fast" conformational change. The existence
of prefibrillar precursors acting as aggregation nuclei has been widely observed in amyloid
formation, both as pre-existing seeds and as actual self-assembled nuclei. The present
results point out that, along with the existence of such precursors, local density fluctuations
may play a critical role in the nucleation mechanism and trigger amyloid fibrillogenesis.
Figure captions




FIG. 1: Snapshots of insulin aggregation kinetics at 60 °C monitored by kinetic AFM. Times 
elapsed after incubation: (A) 1 min. (B) 180 min. (C) 250 min. (D) 540 min. The vertical color 
scale is (A)-(C) 5 nm, and (D) 30 nm 
FIG. 2: Oligomer distributions in the course of kinetics. (A)-(C) Counts of areas observed in AFM
images of Fig. 1A-C, respectively. (D)-(F) Frequency of occurrence of aggregation numbers of
objects observed in AFM images of Fig. 1A-C, respectively. Solid lines are fits by expression (2).
FIG. 3: Size-eccentricity correlation. (A)-(C) Correlation of eccentricity and areas of objects
observed in AFM images of Fig. 1A-C, respectively. Dotted curves represent average eccentricity
versus aggregate area.